AITopics | environment parameter

environment parameter

24662461d2194d1bc70a47b6b6771026-Paper-Conference.pdf

Neural Information Processing Systems | Feb-9-2026, 13:47:15 GMT

Existing works mainly focus on arranging the levels to explicitly form a curriculum. In this work, we take a close look at the learning process itself under the multi-level training in Procgen.

justification, machine learning, reinforcement learning, (16 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Research Report (0.47)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.30)

Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design

Neural Information Processing Systems | Feb-9-2026, 11:27:15 GMT

A wide range of reinforcement learning (RL) problems -- including robustness, transfer learning, unsupervised RL, and emergent complexity -- require specifying a distribution of tasks or environments in which a policy will be trained.

artificial intelligence, arxivpreprintarxiv, machine learning, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Alameda County > Berkeley (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Industry: Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.47)

State Regularized Policy Optimization on Data with Dynamics Shift

Neural Information Processing Systems | Dec-25-2025, 20:01:46 GMT

In many real-world scenarios, Reinforcement Learning (RL) algorithms are trained on data with dynamics shift, i.e., with different underlying environment dynamics. A majority of current methods address this issue by training context encoders to identify environment parameters. Data with dynamics shift are separated according to their environment parameters to train the corresponding policy. However, these methods can be sample inefficient as data are used ad hoc, and policies trained for one dynamics cannot benefit from data collected in all other environments with different dynamics. In this paper, we find that in many environments with similar structures and different dynamics, optimal policies have similar stationary state distributions. We exploit this property and learn the stationary state distribution from data with dynamics shift for efficient data reuse. This distribution is used to regularize the policy trained in a new environment, leading to the SRPO (State Regularized Policy Optimization) algorithm. To conduct theoretical analyses, the intuition of similar environment structures is characterized by the notion of homomorphous MDPs. We then demonstrate a lower-bound performance guarantee on policies regularized by the stationary state distribution. In practice, SRPO can be an add-on module to context-based algorithms in both online and offline RL settings. Experimental results show that SRPO can make several context-based algorithms far more data efficient and significantly improve their overall performance.

name change, state regularized policy optimization, stationary state distribution, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.59)

Robust Agents in Open-Ended Worlds

Samvelyan, Mikayel

arXiv.org Artificial Intelligence | Dec-10-2025

The growing prevalence of artificial intelligence (AI) in various applications underscores the need for agents that can successfully navigate and adapt to an ever-changing, open-ended world. A key challenge is ensuring these AI agents are robust, excelling not only in familiar settings observed during training but also effectively generalising to previously unseen and varied scenarios. In this thesis, we harness methodologies from open-endedness and multi-agent learning to train and evaluate robust AI agents capable of generalising to novel environments, out-of-distribution inputs, and interactions with other co-player agents. We begin by introducing MiniHack, a sandbox framework for creating diverse environments through procedural content generation. Based on the game of NetHack, MiniHack enables the construction of new tasks for reinforcement learning (RL) agents with a focus on generalisation. We then present Maestro, a novel approach for generating adversarial curricula that progressively enhance the robustness and generality of RL agents in two-player zero-sum games. We further probe robustness in multi-agent domains, utilising quality-diversity methods to systematically identify vulnerabilities in state-of-the-art, pre-trained RL policies within the complex video game football domain, characterised by intertwined cooperative and competitive dynamics. Finally, we extend our exploration of robustness to the domain of LLMs. Here, our focus is on diagnosing and enhancing the robustness of LLMs against adversarial prompts, employing evolutionary search to generate a diverse range of effective inputs that aim to elicit undesirable outputs from an LLM. This work collectively paves the way for future advancements in AI robustness, enabling the development of agents that not only adapt to an ever-evolving world but also thrive in the face of unforeseen challenges and interactions.

evolutionary algorithm, large language model, machine learning, (22 more...)

arXiv.org Artificial Intelligence

2512.08139

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Massachusetts (0.27)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Instructional Material (1.00)
Research Report > Promising Solution (0.65)

Industry:

Leisure & Entertainment > Sports > Motorsports > Formula One (1.00)
Leisure & Entertainment > Games > Computer Games (1.00)
Information Technology > Security & Privacy (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
(4 more...)

Implicit Curriculum in Procgen Made Explicit

Tan, Zhenxiong

Neural Information Processing Systems | Oct-9-2025, 21:06:08 GMT

Existing works mainly focus on arranging the levels to explicitly form a curriculum.

agent, environment parameter, procgen, (16 more...)

Neural Information Processing Systems

Country:

Asia > Singapore (0.05)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

74953ef4abd9c436344e59d687ad34d3-Paper-Conference.pdf

Neural Information Processing Systems | Sep-28-2025, 13:31:44 GMT

generator, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.66)

Industry:

Education (1.00)
Energy > Oil & Gas (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Towards Agent-based Test Support Systems: An Unsupervised Environment Design Approach

Ogbodo, Collins O., Rogers, Timothy J., Borgo, Mattia Dal, Wagg, David J.

arXiv.org Artificial Intelligence | Aug-21-2025

Modal testing plays a critical role in structural analysis by providing essential insights into dynamic behaviour across a wide range of engineering industries. In practice, designing an effective modal test campaign involves complex experimental planning, comprising a series of interdependent decisions that significantly influence the final test outcome. Traditional approaches to test design are typically static, focusing only on global tests without accounting for evolving test campaign parameters or the impact of such changes on previously established decisions, such as sensor configurations, which have been found to significantly influence test outcomes. These rigid methodologies often compromise test accuracy and adaptability. To address these limitations, this study introduces an agent-based decision support framework for adaptive sensor placement across dynamically changing modal test environments. The framework formulates the problem using an underspecified partially observable Markov decision process, enabling the training of a generalist reinforcement learning agent through a dual-curriculum learning strategy. A detailed case study on a steel cantilever structure demonstrates the efficacy of the proposed method in optimising sensor locations across frequency segments, validating its robustness and real-world applicability in experimental settings.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Artificial Intelligence

2508.14135

Country: Europe > United Kingdom (0.29)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.71)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.87)

Adversarial Environment Design via Regret-Guided Diffusion Models

Neural Information Processing Systems | Aug-20-2025, 17:21:59 GMT

By incorporating the entropy term, we can ensure the diversity of the generated environments.

generator, machine learning, reinforcement learning, (19 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (0.93)

Industry:

Education (0.68)
Energy > Oil & Gas (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

CRED: Counterfactual Reasoning and Environment Design for Active Preference Learning

Tung, Yi-Shiuan, Hayes, Bradley, Roncone, Alessandro

arXiv.org Artificial Intelligence | Jul-9-2025

For effective real-world deployment, robots should adapt to human preferences, such as balancing distance, time, and safety in delivery routing. Active preference learning (APL) learns human reward functions by presenting trajectories for ranking. However, existing methods often struggle to explore the full trajectory space and fail to identify informative queries, particularly in long-horizon tasks. We propose CRED, a trajectory generation method for APL that improves reward estimation by jointly optimizing environment design and trajectory selection. CRED "imagines" new scenarios through environment design and uses counterfactual reasoning -- by sampling rewards from its current belief and asking "What if this reward were the true preference?" -- to generate a diverse and informative set of trajectories for ranking. Experiments in GridWorld and real-world navigation using OpenStreetMap data show that CRED improves reward learning and generalizes effectively across different environments.

machine learning, reinforcement learning, trajectory, (18 more...)

arXiv.org Artificial Intelligence

2507.05458

Country: North America > United States > Colorado > Boulder County > Boulder (0.04)

Genre: Research Report (1.00)

Technology: